Skip to content

Conversation

@inbelic
Copy link
Contributor

@inbelic inbelic commented Jan 6, 2025

- Implement the dispatch behaviour of Parse and ParseRootElement in ParseHLSLRootSignature.cpp
- Define the general in-memory datastructure of a RootElement that will be a union of the various RootElement types

- Implement the ParseRootFlags methods in ParseHLSLRootSignature
- Define the in-memory represenation of the RootFlag and adds it to the RootElement structure
- Add testing of valid inputs to ParseHLSLRootSignatureTest.cpp

- Define the Parser class that will contain all the parsing methods in
ParseHLSLRootSignature.h
- Implement the dispatch behaviour of Parse and ParseRootElement in
ParseHLSLRootSignature.cpp
- Define the general in-memory datastructure of a RootElement that will
be a union of the various RootElement types

- Implement the ParseRootFlags methods in ParseHLSLRootSignature
- Define the in-memory represenation of the RootFlag and adds it to the
RootElement structure
- Add testing of valid inputs to ParseHLSLRootSignatureTest.cpp
@inbelic inbelic marked this pull request as ready for review January 6, 2025 17:25
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" HLSL HLSL Language Support labels Jan 6, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 6, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-hlsl

Author: Finn Plummer (inbelic)

Changes
- Implement the dispatch behaviour of Parse and ParseRootElement in ParseHLSLRootSignature.cpp
- Define the general in-memory datastructure of a RootElement that will be a union of the various RootElement types

- Implement the ParseRootFlags methods in ParseHLSLRootSignature
- Define the in-memory represenation of the RootFlag and adds it to the RootElement structure
- Add testing of valid inputs to ParseHLSLRootSignatureTest.cpp

Part of the work for #120472


Full diff: https://github.com/llvm/llvm-project/pull/121799.diff

6 Files Affected:

  • (added) clang/include/clang/Sema/ParseHLSLRootSignature.h (+63)
  • (modified) clang/lib/Sema/CMakeLists.txt (+1)
  • (added) clang/lib/Sema/ParseHLSLRootSignature.cpp (+139)
  • (modified) clang/unittests/Sema/CMakeLists.txt (+1)
  • (added) clang/unittests/Sema/ParseHLSLRootSignatureTest.cpp (+58)
  • (added) llvm/include/llvm/Frontend/HLSL/HLSLRootSignature.h (+88)
diff --git a/clang/include/clang/Sema/ParseHLSLRootSignature.h b/clang/include/clang/Sema/ParseHLSLRootSignature.h
new file mode 100644
index 00000000000000..7d1799e22b515c
--- /dev/null
+++ b/clang/include/clang/Sema/ParseHLSLRootSignature.h
@@ -0,0 +1,63 @@
+//===--- ParseHLSLRootSignature.h -------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+//  This file defines the ParseHLSLRootSignature interface.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_SEMA_PARSEHLSLROOTSIGNATURE_H
+#define LLVM_CLANG_SEMA_PARSEHLSLROOTSIGNATURE_H
+
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/StringSwitch.h"
+
+#include "llvm/Frontend/HLSL/HLSLRootSignature.h"
+
+namespace llvm {
+namespace hlsl {
+namespace root_signature {
+
+class Parser {
+public:
+  Parser(StringRef Signature, SmallVector<RootElement> *Elements)
+      : Buffer(Signature), Elements(Elements) {}
+
+  // Consumes the internal buffer as a list of root elements and will
+  // emplace their in-memory representation onto the back of Elements.
+  //
+  // It will consume until it successfully reaches the end of the buffer,
+  // or until the first error is encountered. The return value denotes if
+  // there was a failure.
+  bool Parse();
+
+private:
+  bool ReportError();
+
+  // RootElements parse methods
+  bool ParseRootElement();
+
+  bool ParseRootFlags();
+  // Enum methods
+  template <typename EnumType>
+  bool ParseEnum(SmallVector<std::pair<StringLiteral, EnumType>> Mapping,
+                 EnumType &Enum);
+  bool ParseRootFlag(RootFlags &Flag);
+
+  // Internal state used when parsing
+  StringRef Buffer;
+  SmallVector<RootElement> *Elements;
+
+  StringRef Token;
+};
+
+} // namespace root_signature
+} // namespace hlsl
+} // namespace llvm
+
+#endif // LLVM_CLANG_SEMA_PARSEHLSLROOTSIGNATURE_H
diff --git a/clang/lib/Sema/CMakeLists.txt b/clang/lib/Sema/CMakeLists.txt
index 719c3a9312ec15..7141bb42eb4363 100644
--- a/clang/lib/Sema/CMakeLists.txt
+++ b/clang/lib/Sema/CMakeLists.txt
@@ -24,6 +24,7 @@ add_clang_library(clangSema
   JumpDiagnostics.cpp
   MultiplexExternalSemaSource.cpp
   ParsedAttr.cpp
+  ParseHLSLRootSignature.cpp
   Scope.cpp
   ScopeInfo.cpp
   Sema.cpp
diff --git a/clang/lib/Sema/ParseHLSLRootSignature.cpp b/clang/lib/Sema/ParseHLSLRootSignature.cpp
new file mode 100644
index 00000000000000..e4592ea1937178
--- /dev/null
+++ b/clang/lib/Sema/ParseHLSLRootSignature.cpp
@@ -0,0 +1,139 @@
+#include "clang/Sema/ParseHLSLRootSignature.h"
+
+namespace llvm {
+namespace hlsl {
+namespace root_signature {
+
+// TODO: Hook up with Sema to properly report semantic/validation errors
+bool Parser::ReportError() { return true; }
+
+bool Parser::ParseRootFlags() {
+  // Set to RootFlags::None and skip whitespace to catch when we have RootFlags(
+  // )
+  RootFlags Flags = RootFlags::None;
+  Buffer = Buffer.drop_while(isspace);
+  StringLiteral Prefix = "";
+
+  // Loop until we reach the end of the rootflags
+  while (!Buffer.starts_with(")")) {
+    // Trim expected | when more than 1 flag
+    if (!Buffer.consume_front(Prefix))
+      return ReportError();
+    Prefix = "|";
+
+    // Remove any whitespace
+    Buffer = Buffer.drop_while(isspace);
+
+    RootFlags CurFlag;
+    if (ParseRootFlag(CurFlag))
+      return ReportError();
+    Flags |= CurFlag;
+
+    // Remove any whitespace
+    Buffer = Buffer.drop_while(isspace);
+  }
+
+  // Create and push the root element on the parsed elements
+  Elements->push_back(RootElement(Flags));
+  return false;
+}
+
+template <typename EnumType>
+bool Parser::ParseEnum(SmallVector<std::pair<StringLiteral, EnumType>> Mapping,
+                       EnumType &Enum) {
+  // Retrieve enum
+  Token = Buffer.take_while([](char C) { return isalnum(C) || C == '_'; });
+  Buffer = Buffer.drop_front(Token.size());
+
+  // Try to get the case-insensitive enum
+  auto Switch = llvm::StringSwitch<std::optional<EnumType>>(Token);
+  for (auto Pair : Mapping)
+    Switch.CaseLower(Pair.first, Pair.second);
+  auto MaybeEnum = Switch.Default(std::nullopt);
+  if (!MaybeEnum)
+    return true;
+  Enum = *MaybeEnum;
+
+  return false;
+}
+
+bool Parser::ParseRootFlag(RootFlags &Flag) {
+  SmallVector<std::pair<StringLiteral, RootFlags>> Mapping = {
+      {"0", RootFlags::None},
+      {"ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT",
+       RootFlags::AllowInputAssemblerInputLayout},
+      {"DENY_VERTEX_SHADER_ROOT_ACCESS", RootFlags::DenyVertexShaderRootAccess},
+      {"DENY_HULL_SHADER_ROOT_ACCESS", RootFlags::DenyHullShaderRootAccess},
+      {"DENY_DOMAIN_SHADER_ROOT_ACCESS", RootFlags::DenyDomainShaderRootAccess},
+      {"DENY_GEOMETRY_SHADER_ROOT_ACCESS",
+       RootFlags::DenyGeometryShaderRootAccess},
+      {"DENY_PIXEL_SHADER_ROOT_ACCESS", RootFlags::DenyPixelShaderRootAccess},
+      {"ALLOW_STREAM_OUTPUT", RootFlags::AllowStreamOutput},
+      {"LOCAL_ROOT_SIGNATURE", RootFlags::LocalRootSignature},
+      {"DENY_AMPLIFICATION_SHADER_ROOT_ACCESS",
+       RootFlags::DenyAmplificationShaderRootAccess},
+      {"DENY_MESH_SHADER_ROOT_ACCESS", RootFlags::DenyMeshShaderRootAccess},
+      {"CBV_SRV_UAV_HEAP_DIRECTLY_INDEXED",
+       RootFlags::CBVSRVUAVHeapDirectlyIndexed},
+      {"SAMPLER_HEAP_DIRECTLY_INDEXED", RootFlags::SamplerHeapDirectlyIndexed},
+      {"AllowLowTierReservedHwCbLimit",
+       RootFlags::AllowLowTierReservedHwCbLimit},
+  };
+
+  return ParseEnum<RootFlags>(Mapping, Flag);
+}
+
+bool Parser::ParseRootElement() {
+  // Define different ParserMethods to use StringSwitch for dispatch
+  enum class ParserMethod {
+    ReportError,
+    ParseRootFlags,
+  };
+
+  // Retreive which method should be used
+  auto Method = llvm::StringSwitch<ParserMethod>(Token)
+                    .Case("RootFlags", ParserMethod::ParseRootFlags)
+                    .Default(ParserMethod::ReportError);
+
+  // Dispatch on the correct method
+  switch (Method) {
+  case ParserMethod::ReportError:
+    return ReportError();
+  case ParserMethod::ParseRootFlags:
+    return ParseRootFlags();
+  }
+}
+
+// Parser entry point function
+bool Parser::Parse() {
+  StringLiteral Prefix = "";
+  while (!Buffer.empty()) {
+    // Trim expected comma when more than 1 root element
+    if (!Buffer.consume_front(Prefix))
+      return ReportError();
+    Prefix = ",";
+
+    // Remove any whitespace
+    Buffer = Buffer.drop_while(isspace);
+
+    // Retrieve the root element identifier
+    auto Split = Buffer.split('(');
+    Token = Split.first;
+    Buffer = Split.second;
+
+    // Dispatch to the applicable root element parser
+    if (ParseRootElement())
+      return true;
+
+    // Then we can clean up the remaining ")"
+    if (!Buffer.consume_front(")"))
+      return ReportError();
+  }
+
+  // All input has been correctly parsed
+  return false;
+}
+
+} // namespace root_signature
+} // namespace hlsl
+} // namespace llvm
diff --git a/clang/unittests/Sema/CMakeLists.txt b/clang/unittests/Sema/CMakeLists.txt
index 7ded562e8edfa5..f382f8b1235306 100644
--- a/clang/unittests/Sema/CMakeLists.txt
+++ b/clang/unittests/Sema/CMakeLists.txt
@@ -7,6 +7,7 @@ add_clang_unittest(SemaTests
   ExternalSemaSourceTest.cpp
   CodeCompleteTest.cpp
   GslOwnerPointerInference.cpp
+  ParseHLSLRootSignatureTest.cpp
   SemaLookupTest.cpp
   SemaNoloadLookupTest.cpp
   )
diff --git a/clang/unittests/Sema/ParseHLSLRootSignatureTest.cpp b/clang/unittests/Sema/ParseHLSLRootSignatureTest.cpp
new file mode 100644
index 00000000000000..0e7feb50871669
--- /dev/null
+++ b/clang/unittests/Sema/ParseHLSLRootSignatureTest.cpp
@@ -0,0 +1,58 @@
+//=== ParseHLSLRootSignatureTest.cpp - Parse Root Signature tests ---------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/Sema/ParseHLSLRootSignature.h"
+#include "gtest/gtest.h"
+
+using namespace llvm::hlsl::root_signature;
+
+namespace {
+
+TEST(ParseHLSLRootSignature, EmptyRootFlags) {
+  llvm::StringRef RootFlagString = " RootFlags()";
+  llvm::SmallVector<RootElement> RootElements;
+  Parser Parser(RootFlagString, &RootElements);
+  ASSERT_FALSE(Parser.Parse());
+  ASSERT_EQ(RootElements.size(), (unsigned long)1);
+  ASSERT_EQ(RootFlags::None, RootElements[0].Flags);
+}
+
+TEST(ParseHLSLRootSignature, RootFlagsNone) {
+  llvm::StringRef RootFlagString = " RootFlags(0)";
+  llvm::SmallVector<RootElement> RootElements;
+  Parser Parser(RootFlagString, &RootElements);
+  ASSERT_FALSE(Parser.Parse());
+  ASSERT_EQ(RootElements.size(), (unsigned long)1);
+  ASSERT_EQ(RootFlags::None, RootElements[0].Flags);
+}
+
+TEST(ParseHLSLRootSignature, ValidRootFlags) {
+  // Test that the flags are all captured and that they are case insensitive
+  llvm::StringRef RootFlagString = " RootFlags( "
+                                   " ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT"
+                                   "| deny_vertex_shader_root_access"
+                                   "| DENY_HULL_SHADER_ROOT_ACCESS"
+                                   "| deny_domain_shader_root_access"
+                                   "| DENY_GEOMETRY_SHADER_ROOT_ACCESS"
+                                   "| deny_pixel_shader_root_access"
+                                   "| ALLOW_STREAM_OUTPUT"
+                                   "| LOCAL_ROOT_SIGNATURE"
+                                   "| deny_amplification_shader_root_access"
+                                   "| DENY_MESH_SHADER_ROOT_ACCESS"
+                                   "| cbv_srv_uav_heap_directly_indexed"
+                                   "| SAMPLER_HEAP_DIRECTLY_INDEXED"
+                                   "| AllowLowTierReservedHwCbLimit )";
+
+  llvm::SmallVector<RootElement> RootElements;
+  Parser Parser(RootFlagString, &RootElements);
+  ASSERT_FALSE(Parser.Parse());
+  ASSERT_EQ(RootElements.size(), (unsigned long)1);
+  ASSERT_EQ(RootFlags::ValidFlags, RootElements[0].Flags);
+}
+
+} // anonymous namespace
diff --git a/llvm/include/llvm/Frontend/HLSL/HLSLRootSignature.h b/llvm/include/llvm/Frontend/HLSL/HLSLRootSignature.h
new file mode 100644
index 00000000000000..a17ebffc7a6bf2
--- /dev/null
+++ b/llvm/include/llvm/Frontend/HLSL/HLSLRootSignature.h
@@ -0,0 +1,88 @@
+//===- HLSLRootSignature.h - HLSL Root Signature helper objects -----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file This file contains helper objects for working with HLSL Root
+/// Signatures.
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_FRONTEND_HLSL_HLSLROOTSIGNATURE_H
+#define LLVM_FRONTEND_HLSL_HLSLROOTSIGNATURE_H
+
+#include <stdint.h>
+
+#include "llvm/Support/Endian.h"
+
+namespace llvm {
+namespace hlsl {
+namespace root_signature {
+
+// This is a copy from DebugInfo/CodeView/CodeView.h
+#define RS_DEFINE_ENUM_CLASS_FLAGS_OPERATORS(Class)                            \
+  inline Class operator|(Class a, Class b) {                                   \
+    return static_cast<Class>(llvm::to_underlying(a) |                         \
+                              llvm::to_underlying(b));                         \
+  }                                                                            \
+  inline Class operator&(Class a, Class b) {                                   \
+    return static_cast<Class>(llvm::to_underlying(a) &                         \
+                              llvm::to_underlying(b));                         \
+  }                                                                            \
+  inline Class operator~(Class a) {                                            \
+    return static_cast<Class>(~llvm::to_underlying(a));                        \
+  }                                                                            \
+  inline Class &operator|=(Class &a, Class b) {                                \
+    a = a | b;                                                                 \
+    return a;                                                                  \
+  }                                                                            \
+  inline Class &operator&=(Class &a, Class b) {                                \
+    a = a & b;                                                                 \
+    return a;                                                                  \
+  }
+
+// Various enumerations and flags
+
+enum class RootFlags : uint32_t {
+  None = 0,
+  AllowInputAssemblerInputLayout = 0x1,
+  DenyVertexShaderRootAccess = 0x2,
+  DenyHullShaderRootAccess = 0x4,
+  DenyDomainShaderRootAccess = 0x8,
+  DenyGeometryShaderRootAccess = 0x10,
+  DenyPixelShaderRootAccess = 0x20,
+  AllowStreamOutput = 0x40,
+  LocalRootSignature = 0x80,
+  DenyAmplificationShaderRootAccess = 0x100,
+  DenyMeshShaderRootAccess = 0x200,
+  CBVSRVUAVHeapDirectlyIndexed = 0x400,
+  SamplerHeapDirectlyIndexed = 0x800,
+  AllowLowTierReservedHwCbLimit = 0x80000000,
+  ValidFlags = 0x80000fff
+};
+RS_DEFINE_ENUM_CLASS_FLAGS_OPERATORS(RootFlags)
+
+// Define the in-memory layout structures
+
+struct RootElement {
+  enum class ElementType {
+    RootFlags,
+  };
+
+  ElementType Tag;
+  union {
+    RootFlags Flags;
+  };
+
+  // Constructors
+  RootElement(RootFlags Flags) : Tag(ElementType::RootFlags), Flags(Flags) {}
+};
+
+} // namespace root_signature
+} // namespace hlsl
+} // namespace llvm
+
+#endif // LLVM_FRONTEND_HLSL_HLSLROOTSIGNATURE_H

namespace hlsl {
namespace root_signature {

// This is a copy from DebugInfo/CodeView/CodeView.h
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This macro is defined as there will be additional enum class flags that will require this as well. We don't necessarily use the operators, so we could also manually define only the ones used. Or we can move the debuginfo version into a common header?

Copy link
Collaborator

@llvm-beanz llvm-beanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason that you're just operating on a StringRef buffer instead of building out a proper tokenizer and parser? It seems to me like root signature parsing probably warrants full lexer and parser that operates on tokens rather than just sections of the substring.

}

bool Parser::ParseRootFlag(RootFlags &Flag) {
SmallVector<std::pair<StringLiteral, RootFlags>> Mapping = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this to not be a llvm::StringSwitch instead?

@inbelic
Copy link
Contributor Author

inbelic commented Jan 6, 2025

I didn't implement the tokenizer because I found that the extra level of abstraction to be redundant/not beneficial with the StringRef operations.

Looking at the DXC implementation, the usage of the Tokenizer is either GetAndMatchToken, or, getToken for an identifier with a switch on a small subset of tokens. These are effectively just StringRef::consume_front/StringSwitch with the buffer abstracted into the Tokenizer.

Since we can just go through the buffer from left to right and construct the RootElements in place, then we will not reference a previous token, and so, defining/lexing an intermediate Token seems redundant.

What aspects are you referring to that would warrant it?

@llvm-beanz
Copy link
Collaborator

I didn't implement the tokenizer because I found that the extra level of abstraction to be redundant/not beneficial with the StringRef operations.

I disagree, and generally string operations tend to muddy up and can lead to inefficiencies.

Looking at the DXC implementation, the usage of the Tokenizer is either GetAndMatchToken, or, getToken for an identifier with a switch on a small subset of tokens. These are effectively just StringRef::consume_front/StringSwitch with the buffer abstracted into the Tokenizer.

I don't think DXC's implementation is a good reference.

Since we can just go through the buffer from left to right and construct the RootElements in place, then we will not reference a previous token, and so, defining/lexing an intermediate Token seems redundant.

What aspects are you referring to that would warrant it?

In general, we should lex tokens once, transform them to enums and associated state and move on. Let's take this root signature as an example:

DescriptorTable(CBV(b0, space=1)))

This becomes a token stream something like:

  • DescriptorTable - keyword
  • ( - lparen
  • CBV - keyword
  • ( - lparen
  • b0 - register
  • , - comma
  • space - keyword
  • = - equal
  • 1 - number
  • ) - rparen
  • ) - rparen
  • ) - rparen

Having a token representation where keywords and grammar tokens are converted to enumerations prevents having string or character operations throughout the parser. This is in line with Clang's tokenizer design, and seems like something we should also match.

Having the tokenizer also be able to pre-parse numbers and register tokens into constituent parts ensures that the lexing errors are simple to emit and occur where expected.

The lexing rules for HLSL are pretty simple. I would probably write the Lexer in an iterator pattern and just have a token iterator that walks token to token with a small copyable state. That would allow lookahead where necessary. You don't need to design this the way I would, Clang's model of the "Parser" preserving the current lexer state is also reasonable (that is more similar to how DXC implements this.

In either case, abstracting string and pointer manipulation is really important. If we add new root signature keywords I shouldn't need to add new logic for string comparisons. I would look at how Clang's TokenKinds.def defines keyword and punctuator tokens and I would look to define our parser similarly.

@inbelic inbelic marked this pull request as draft January 8, 2025 19:20
@inbelic inbelic closed this Jan 17, 2025
@inbelic inbelic deleted the inbelic/parser-root-flags branch January 17, 2025 23:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category HLSL HLSL Language Support

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants